noisy validation
Set a Thief to Catch a Thief: Combating Label Noise through Noisy Meta Learning
Wang, Hanxuan, Lu, Na, Zhao, Xueying, Yan, Yuxuan, Ma, Kaipeng, Keong, Kwoh Chee, Carneiro, Gustavo
Abstract -- Learning from noisy labels (LNL) aims to train high-performance deep models using noisy datasets. Meta-learning-based label correction methods have demonstrated remarkable performance in LNL by designing various meta label rectification tasks. However, an extra clean validation set is a prerequisite for these methods to perform label correction, requiring additional labor and greatly limiting their practicality. To tackle this issue, we propose STCT, a novel noisy meta label correction framework that counterintuitively uses noisy data to correct label noise, borrowing the spirit of the saying "Set a Thief to Catch a Thief". The core idea of STCT is to leverage noisy data that is i.i.d. with the training data as a validation set to evaluate model performance and perform label correction in a meta-learning framework, eliminating the need for extra clean data. By decoupling the complex bi-level optimization in meta learning into representation learning and label correction, STCT is solved through an alternating training strategy between noisy meta correction and semi-supervised representation learning. Extensive experiments on synthetic and real-world datasets demonstrate the outstanding performance of STCT, particularly in high-noise-rate scenarios. STCT achieves 96.9% label correction and 95.2% classification performance on CIFAR-10 with 80% symmetric noise, significantly surpassing the current state-of-the-art.

Introduction

Deep learning has achieved great success in various fields, attributed to the availability of carefully annotated large-scale datasets [1], [2], [3].
However, collecting high-quality datasets generally comes with high annotation cost and intensive human intervention, creating a significant obstacle to the development of deep learning. Fortunately, the annotation cost can be mitigated through web crawling [4] and crowdsourcing [5]. However, such low-cost datasets often contain a considerable number of noisy labels, which may lead to severe overfitting of neural networks and performance degradation [6].
Estimating the Conformal Prediction Threshold from Noisy Labels
Penso, Coby, Goldberger, Jacob, Fetaya, Ethan
Conformal Prediction (CP) is a method to control prediction uncertainty by producing a small prediction set, ensuring a predetermined probability that the true class lies within this set. This is commonly done by defining a score, based on the model predictions, and setting a threshold on this score using a validation set. In this study, we address the problem of CP calibration when we only have access to a validation set with noisy labels. We show how we can estimate the noise-free conformal threshold based on the noisy labeled data. Our solution is flexible and can accommodate various modeling assumptions regarding the label contamination process, without needing any information about the underlying data distribution or the internal mechanisms of the machine learning classifier. We develop a coverage guarantee for uniform noise that is effective even in tasks with a large number of classes. We dub our approach Noise-Aware Conformal Prediction (NACP) and show on several natural and medical image classification datasets, including ImageNet, that it significantly outperforms current noisy label methods and achieves results comparable to those obtained with a clean validation set.
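As a concrete illustration of the standard calibration step that NACP builds on, the sketch below computes a split-conformal threshold from a validation set and forms prediction sets on test points. The score definition (one minus the probability assigned to the true class) and the function names are illustrative assumptions, not the paper's implementation, and this baseline assumes clean validation labels rather than the noisy-label setting the paper addresses.

```python
import numpy as np

def conformal_threshold(probs, labels, alpha=0.1):
    """Split-conformal threshold for 1 - alpha marginal coverage.

    probs:  (n, K) array of predicted class probabilities on validation data.
    labels: (n,) array of true validation class indices (assumed clean here).
    """
    n = len(labels)
    # Nonconformity score: 1 - model probability of the true class.
    scores = 1.0 - probs[np.arange(n), labels]
    # Finite-sample-corrected quantile level.
    q_level = min(np.ceil((n + 1) * (1 - alpha)) / n, 1.0)
    return np.quantile(scores, q_level, method="higher")

def prediction_set(probs_test, threshold):
    """Include every class whose score falls at or below the threshold."""
    return [np.where(1.0 - p <= threshold)[0] for p in probs_test]
```

With noisy validation labels, the scores and hence this quantile are biased; NACP's contribution is to estimate the noise-free threshold from such contaminated data under an assumed label-contamination model.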